# Masked Language Modeling
Llm Jp Modernbert Base
Apache-2.0
A Japanese language model based on the ModernBERT-base architecture, supporting a maximum sequence length of 8,192 tokens and trained on a 3.4 TB Japanese corpus. A fill-mask usage sketch follows this entry.
Large Language Model · Transformers · Japanese
llm-jp · 1,398 downloads · 5 likes

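For orientation, here is a minimal sketch of querying a masked language model such as this one through the Transformers fill-mask pipeline; the repository id `llm-jp/llm-jp-modernbert-base` is assumed from the listing and may not match the actual checkpoint name.

```python
# Minimal fill-mask sketch; the repository id is assumed from the listing
# above and may differ from the actual checkpoint name.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="llm-jp/llm-jp-modernbert-base")

# Build the prompt around whatever mask token the checkpoint's tokenizer uses.
mask = fill_mask.tokenizer.mask_token
for prediction in fill_mask(f"日本の首都は{mask}です。", top_k=5):
    print(prediction["token_str"], round(prediction["score"], 3))
```
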
Duo Distilled
Apache-2.0
DUO is a pretrained text-generation model that can also be used for masked language modeling tasks. It is trained on the OpenWebText corpus.
Large Language Model · Transformers · English
s-sahoo · 98.21k downloads · 1 like

Moderncamembert Cv2 Base
MIT
A French language model pre-trained on 1 trillion tokens of high-quality French text; the French counterpart of ModernBERT.
Large Language Model · Transformers · French
almanach · 232 downloads · 2 likes

Duo
Apache-2.0
DUO is a pretrained model built on the Transformers library, focused on masked language modeling tasks in natural language processing.
Large Language Model · Transformers · English
s-sahoo · 212 downloads · 1 like

Camembertv2 Base
MIT
CamemBERTv2 is a French language model pre-trained on a 275-billion-token French corpus and is the second-generation version of CamemBERT. It uses the RoBERTa architecture with an improved tokenizer and updated training data.
Large Language Model · Transformers · French
almanach · 1,512 downloads · 11 likes

Roberta Kaz Large
A Kazakh language model based on the RoBERTa architecture, trained from scratch using RobertaForMaskedLM, suitable for Kazakh text processing tasks.
Large Language Model · Transformers · Other
nur-dev · 93 downloads · 3 likes

Codeberta Small V1
CodeBERTa is a code understanding model based on the RoBERTa architecture, specifically trained for multiple programming languages, capable of efficiently handling code-related tasks.
Large Language Model · Transformers · Other
claudios · 16 downloads · 1 like

Saudibert
SaudiBERT is the first large-scale pre-trained language model specifically focused on Saudi dialect text, trained on a massive corpus of Saudi tweets and forum posts.
Large Language Model · Transformers · Arabic
faisalq · 233 downloads · 6 likes

Multilingual Albert Base Cased 64k
Apache-2.0
A multilingual ALBERT model pretrained with the masked language modeling (MLM) objective, using a 64k-token vocabulary and case-sensitive tokenization. A short inference sketch follows this entry.
Large Language Model · Transformers · Supports Multiple Languages
cservan · 52 downloads · 1 like

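To illustrate what the MLM objective looks like at inference time, the sketch below masks a single token and reads the model's distribution at that position. It works with any masked-LM checkpoint; the ALBERT repository id used here is assumed from the listing.

```python
# Masked-token prediction without the pipeline helper. The repository id is
# assumed from the listing; any masked-LM checkpoint works the same way.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "cservan/multilingual-albert-base-cased-64k"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = f"Paris is the {tokenizer.mask_token} of France."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Find the masked position and list the five most likely replacement tokens.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top = logits[0, mask_pos].softmax(dim=-1).topk(5)
print(tokenizer.convert_ids_to_tokens(top.indices[0].tolist()))
```
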
Nasa Smd Ibm V0.1
Apache-2.0
Indus is an encoder-only Transformer model based on RoBERTa, specifically optimized for NASA Science Mission Directorate (SMD) applications, suitable for scientific information retrieval and intelligent search.
Large Language Model · Transformers · English
nasa-impact · 631 downloads · 33 likes

Albertina 100m Portuguese Ptbr Encoder
MIT
Albertina 100M PTBR is a foundational large language model for Brazilian Portuguese, belonging to the BERT family of encoders, based on the Transformer neural network architecture, and developed from the DeBERTa model.
Large Language Model · Transformers · Other
PORTULAN · 131 downloads · 7 likes

Albertina 100m Portuguese Ptpt Encoder
MIT
Albertina 100M PTPT is a foundational large language model for European Portuguese (Portugal), belonging to the BERT family of encoders. It is based on the Transformer neural network architecture and developed upon the DeBERTa model.
Large Language Model · Transformers · Other
PORTULAN · 171 downloads · 4 likes

Bert Mlm Medium
A medium-sized BERT language model using Masked Language Modeling (MLM) as the pre-training objective.
Large Language Model · Transformers
aajrami · 14 downloads · 0 likes

My Awesome Eli5 Mlm Model
Apache-2.0
A masked language model fine-tuned from distilroberta-base; the model card does not state a specific downstream purpose. A condensed fine-tuning sketch follows this entry.
Large Language Model · Transformers
stevhliu · 425 downloads · 1 like

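Since this entry is the product of a standard MLM fine-tuning recipe, here is a condensed sketch of such a recipe (an assumption, not the author's exact script): tokens are masked on the fly by DataCollatorForLanguageModeling and the model is trained with Trainer on a stand-in corpus.

```python
# Condensed MLM fine-tuning sketch (assumed recipe, not the author's script).
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "distilroberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Any plain-text dataset works; a small slice of wikitext-2 is used as a stand-in.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
tokenized = raw.map(lambda b: tokenizer(b["text"], truncation=True, max_length=128),
                    batched=True, remove_columns=raw.column_names)

# The collator randomly masks 15% of tokens in each batch on the fly.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mlm-finetune",
                           per_device_train_batch_size=8,
                           num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```
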
Esm2 T48 15B UR50D
MIT
ESM-2 is a state-of-the-art protein language model trained with a masked language modeling objective, suitable for fine-tuning on a wide range of protein sequence tasks. A masked-residue prediction sketch follows this entry.
Protein Model · Transformers
facebook · 20.80k downloads · 20 likes

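The same fill-mask interface applies to protein sequences. To keep the example light, the sketch below uses the small public ESM-2 checkpoint facebook/esm2_t6_8M_UR50D rather than the 15B model listed above.

```python
# Masked-residue prediction with a small ESM-2 checkpoint (used here instead
# of the much larger 15B model listed above, purely for illustration).
from transformers import pipeline

unmasker = pipeline("fill-mask", model="facebook/esm2_t6_8M_UR50D")

# Protein sequences are written as amino-acid letters; mask a single residue.
sequence = "MKTAYIAKQR" + unmasker.tokenizer.mask_token + "ISFVKSAEDAMTNA"
for prediction in unmasker(sequence, top_k=3):
    print(prediction["token_str"], round(prediction["score"], 3))
```
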
Finbert Pretrain
FinBERT is a BERT model pretrained on financial communication texts, specifically designed for financial natural language processing tasks.
Large Language Model · Transformers · Other
FinanceInc · 23 downloads · 10 likes

K 12BERT
Apache-2.0
K-12BERT is a BERT model obtained through continued pretraining on K-12 education data, optimized for educational scenarios.
Large Language Model · Transformers · English
vasugoel · 50 downloads · 9 likes

Astrobert
MIT
A language model specifically designed for astronomy and astrophysics, developed by the NASA/ADS team, supporting masked token filling, named entity recognition, and text classification tasks.
Large Language Model · Transformers · English
adsabs · 215 downloads · 14 likes

Albert Base V2 Attribute Correction Mlm
Apache-2.0
A masked language model based on albert-base-v2, fine-tuned for attribute correction of electronics products.
Large Language Model · Transformers
ksabeh · 14 downloads · 0 likes

Roberta Large Japanese
A large Japanese RoBERTa model pretrained on Japanese Wikipedia and the Japanese portion of CC-100, suitable for Japanese natural language processing tasks.
Large Language Model · Transformers · Japanese
nlp-waseda · 227 downloads · 23 likes

Legalbert Large 1.7M 2
A BERT-large model pretrained with RoBERTa pretraining objectives on English legal and administrative texts, specializing in legal-domain language understanding.
Large Language Model · Transformers · English
pile-of-law · 701 downloads · 63 likes

Legalbert Large 1.7M 1
A BERT large model pretrained on English legal and administrative texts using RoBERTa pretraining objectives
Large Language Model · Transformers · English
pile-of-law · 120 downloads · 14 likes

Efficient Mlm M0.15 801010
A RoBERTa variant that uses pre-layer normalization, released to study the effect of the masking ratio in masked language modeling.
Large Language Model · Transformers
princeton-nlp · 114 downloads · 0 likes

TOD XLMR
MIT
TOD-XLMR is a multilingual task-oriented dialogue model developed based on XLM-RoBERTa, employing a dual-objective joint training strategy to enhance dialogue understanding capabilities
Large Language Model · Transformers · Other
umanlp · 54 downloads · 2 likes

Roberta Large Ernie2 Skep En
SKEP (Sentiment Knowledge Enhanced Pre-training) was proposed by Baidu in 2020, specifically designed for sentiment analysis tasks. The model incorporates multi-type knowledge through sentiment masking techniques and three sentiment pre-training objectives.
Large Language Model · Transformers · English
Yaxin · 29 downloads · 2 likes

Bert Base Uncased Mlm
Apache-2.0
A masked language model (MLM) fine-tuned from bert-base-uncased, suitable for fill-mask (text infilling) tasks.
Large Language Model · Transformers
wypoon · 25 downloads · 0 likes

Testmodel
Apache-2.0
BERT is a transformer model pre-trained on large-scale English corpora through self-supervised learning, utilizing masked language modeling and next sentence prediction objectives
Large Language Model · Transformers · English
sramasamy8 · 21 downloads · 0 likes

Bert Base Frozen Generics Mlm
This model freezes all pre-trained BERT weights except the last layer, focusing on masked language modeling tasks to explore the model's generalization ability in processing quantitative statements.
Large Language Model
sello-ralethe · 17 downloads · 0 likes

Bert Base Indonesian 522M
MIT
A BERT base model pretrained on Indonesian Wikipedia using Masked Language Modeling (MLM) objective, case insensitive.
Large Language Model · Other
cahya · 2,799 downloads · 25 likes

Roberta Base Japanese
A Japanese RoBERTa-based pretrained model, trained on Japanese Wikipedia and the Japanese portion of CC-100.
Large Language Model · Transformers · Japanese
nlp-waseda · 456 downloads · 32 likes

Tapas Small
Apache-2.0
TAPAS is a Transformer-based table question answering model pre-trained in a self-supervised manner on Wikipedia tables and associated text, supporting table understanding and question answering tasks. A table-QA usage sketch follows this entry.
Large Language Model · Transformers · English
google · 41 downloads · 0 likes

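For context on how TAPAS checkpoints are typically used after fine-tuning, here is a sketch with the Transformers table-question-answering pipeline. It relies on the publicly available fine-tuned checkpoint google/tapas-base-finetuned-wtq rather than the pretrained MLM checkpoints listed here; older Transformers releases additionally required torch-scatter for TAPAS.

```python
# Table QA sketch with a fine-tuned TAPAS checkpoint (the pretrained MLM
# checkpoints listed here would normally be fine-tuned before such use).
from transformers import pipeline

table_qa = pipeline("table-question-answering",
                    model="google/tapas-base-finetuned-wtq")

# TAPAS expects every cell as a string; the toy table reuses figures from this listing.
table = {
    "Model": ["Tapas Small", "Tapas Tiny"],
    "Downloads": ["41", "44"],
}
result = table_qa(table=table, query="How many downloads does Tapas Tiny have?")
print(result["answer"])
```
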
Dummy Unknown
This is a dummy RoBERTa model for unit testing and continuous integration, primarily used for demonstration and testing purposes
Large Language Model
julien-c · 94.30k downloads · 0 likes

Tapas Base Masklm
TAPAS (Table Parsing) is a pre-trained language model developed by Google specifically for handling table-related tasks.
Large Language Model · Transformers
google · 148 downloads · 0 likes

Politbert
PolitBERT is a BERT model specifically designed for analyzing speeches, interviews, and press conference content of political figures in English-speaking countries.
Large Language Model
maurice · 28 downloads · 1 like

Tapas Tiny Masklm
TAPAS is a table-based pretrained language model specifically designed for tasks related to tabular data.
Large Language Model · Transformers
google · 16 downloads · 0 likes

Debertav2 Base Uncased
Apache-2.0
BERT is a pre-trained language model based on the Transformer architecture, trained on English corpus through masked language modeling and next sentence prediction tasks.
Large Language Model · English
mlcorelib · 21 downloads · 0 likes

Chinese Roberta L 8 H 256
A Chinese RoBERTa model pretrained on CLUECorpusSmall, with 8 layers and a hidden size of 256 (per the model name), suitable for various Chinese NLP tasks.
Large Language Model · Chinese
uer · 15 downloads · 1 like

Albert Large Arabic
An Arabic pretrained version of the ALBERT-large model, trained on approximately 4.4 billion words of Arabic text.
Large Language Model · Transformers · Arabic
asafaya · 45 downloads · 1 like

Tapas Tiny
Apache-2.0
TAPAS is a Transformer-based table question answering model pre-trained in a self-supervised manner on English Wikipedia table data, supporting table QA and entailment tasks.
Large Language Model · Transformers · English
google · 44 downloads · 0 likes

Muppet Roberta Base
MIT
A RoBERTa-based model produced through large-scale multi-task pre-finetuning; it outperforms the original roberta-base on GLUE and question answering tasks.
Large Language Model · Transformers · English
facebook · 425 downloads · 6 likes